Crowdsourced Data Preprocessing with R and Amazon Mechanical Turk

Similar resources

Active Learning with Amazon Mechanical Turk

Supervised classification needs large amounts of annotated training data that is expensive to create. Two approaches that reduce the cost of annotation are active learning and crowdsourcing. However, these two approaches have not been combined successfully to date. We evaluate the utility of active learning in crowdsourcing on two tasks, named entity recognition and sentiment detection, and sho...

Collecting Psycholinguistic Response Time Data Using Amazon Mechanical Turk

Researchers in linguistics and related fields have recently begun exploiting online crowd-sourcing tools, like Amazon Mechanical Turk (AMT), to gather behavioral data. While this method has been successfully validated for various offline measures--grammaticality judgment or other forced-choice tasks--its use for mainstream psycholinguistic research remains limited. This is because psycholinguis...

The Language Demographics of Amazon Mechanical Turk

We present a large-scale study of the languages spoken by bilingual workers on Mechanical Turk (MTurk). We establish a methodology for determining the language skills of anonymous crowd workers that is more robust than simple surveying. We validate workers’ self-reported language skill claims by measuring their ability to correctly translate words, and by geolocating workers to see if they resid...

Using Amazon Mechanical Turk for linguistic research

Amazon’s Mechanical Turk service makes linguistic experimentation quick, easy, and inexpensive. However, researchers have not been certain about its reliability. In a series of experiments, this paper compares data collected via Mechanical Turk to those obtained using more traditional methods. One set of experiments measured the predictability of words in sentences using the Cloze sentence compl...

Clustering dictionary definitions using Amazon Mechanical Turk

Vocabulary tutors need word sense disambiguation (WSD) in order to provide exercises and assessments that match the sense of words being taught. Using expert annotators to build a WSD training set for all the words supported would be too expensive. Crowdsourcing that task seems to be a good solution. However, a first required step is to define what the possible sense labels to assign to word oc...

Journal

Journal title: The R Journal

Year: 2016

ISSN: 2073-4859

DOI: 10.32614/rj-2016-020